{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Offline Translation\n",
    "Authors:\n",
    "- Alexander Wolf\n",
    "\n",
    "created_at: 11/04/2017   \n",
    "last_commit: 03/05/2018\n",
    "\n",
    "### TL;DR \n",
    "\n",
    "This is a notebook showcasing how you can make a translator via the [Tensor2Tensor](https://github.com/tensorflow/tensor2tensor) library. The Tensor2Tensor library was made to be used with shell scripts in the terminal; however, this notebook reversed engineered the library a bit to allow you to train T2T models in code. The API supplies datasets and code to make translators in the following languages.\n",
    "\n",
    "1. French\n",
    "2. German\n",
    "3. Chinese\n",
    "4. Chzech\n",
    "5. Macedonian\n",
    "\n",
    "Tensor2Tensor consists of three main components to train advanced neural networks with minimal code.\n",
    "\n",
    "1. Problems\n",
    "\t- This specifies what type of input modality the network will have and what your task you're trying to solve. \n",
    "2. Model\n",
    "\t- T2T consist of many state of the art models for translation and other tasks, which are independent of the type of input/output modality. For translation tasks it is recommended to use the *Transformer*. \n",
    "3. Hyperparameter Sets\n",
    "    - These are predefined hparam sets to train a model for a specified T2T problem. They can be modified and changed also. \n",
    "    \n",
    "If you want to use one of Tensor2Tensor's precoded model/ framework for training a Neural Net on one of your own datasets and/or a new problem type, it is possible to extend the library yourself. See [here](https://github.com/tensorflow/tensor2tensor/blob/0d464ff699029cd604fdb11fb532f64bebc9134c/docs/new_problem.md) for more info.   "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Import needed Libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Get Tensorflow Flags"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Flags for Tensorflow 1.5 +\n",
    "FLAGS = tf.flags\n",
    "\n",
    "# Flags for Tensorflow < 1.5\n",
    "# FLAGS = tf.flags.FLAGS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Import needed T2T Classes and Fucntions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from tensor2tensor.utils import trainer_lib\n",
    "from tensor2tensor.utils.trainer_lib import create_run_config, create_experiment, create_hparams\n",
    "from tensor2tensor.utils import registry\n",
    "from tensor2tensor import models, problems"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Initialize Notebook Params"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS = {}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define T2T Problem, Model, and HPARAMS\n",
    "These are Tensor2Tensor parameters which will be used to make the correct pipeline for training, evaluating, and decoding your T2T model properly. Read more below to see how to set them. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Problem\n",
    "\n",
    "A Tensor2Tensor translation problem specifies what language you're translating from and to, along with the vocabulary source and size for your translation model. This will be needed to be set for data collection, preparation, and training your model. Add '_rev' to the end of the problem name to translate from a language to english instead of english to the other selected language. \n",
    "\n",
    "In Tensor2Tensor the format of a translation problem name goes as follows\n",
    "- 'translate_en' + OTHER_LANGAUGE_CODE + 'wmt' + VOCAB_SIZE/TYPE *+ (optional) + '_rev'*\n",
    "\n",
    "You can view all the available problems in T2T by executing the code below. It is possible to add your own problem to T2T also if it is not already built in. See [here](https://github.com/tensorflow/tensor2tensor/blob/master/docs/new_problem.md) for more details\n",
    "\n",
    "```python\n",
    "from tensor2tensor import problems\n",
    "problems.available()\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "PARAMS['T2T_Problem'] = 'translate_enfr_wmt32k_rev' # T2T Problem name for French -> English model with 32k word vocab size"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Model\n",
    "\n",
    "Tensor2Tensor has a lot of prebuilt code for neural sequential data processing. For translation, it is advised to use the *'transformer'* model; it is able to train very quickly compared to other models like RNNS, and converges well.\n",
    "\n",
    "To see all available T2T models execute the code below; Note not all of these can be used for translation tasks\n",
    "```python\n",
    "from tensor2tensor.utils import registry\n",
    "registry.list_models()\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "PARAMS['T2T_Model'] = 'transformer' # Setting the Transformer Model for Translation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### HPARAM Set\n",
    "\n",
    "Tensor2Tensor contains different prebuilt hyperparameter sets for defining the number of hidden layers in the model, learning rates, regularization terms, and much more. They are already preconfigured to help you produce state of the art models, but can be modified to how ever you want within the notebook later.\n",
    "\n",
    "For translation if using 1 GPU, it is advised to start off with hparam set *transformer_base*. To view all prebuilt hparams sets for your model you can look at the Tensor2Tensor source code: For example at the bottom of this [py file](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py) you can find all the Transformer hparam sets."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "PARAMS['T2T_HPARAMS'] = 'transformer_base' # Setting the Transformer Model for Translation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generate Data for Problem\n",
    "Tensor2Tensor supplies data generators for each problem in its library. You will need to generate the data into a *Data_Dir* only once. This process can take many hours so by default the function which generates the data is commented out.\n",
    "\n",
    "Make sure to import the correct data generator for the problem your using in the T2T library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from tensor2tensor.data_generators.translate_enfr import TranslateEnfrWmt32k"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set Temp and Data  Directory\n",
    "Set these two directories below.\n",
    "- *TEMP_DIR*\n",
    "\t- Stores downloaded zip files of text data from internet\n",
    "\t- If internally using at Dataiku on dku11 for any translation problem in T2T; keep the *TMP_DIR* below\n",
    "        - Data is already downloaded for you \n",
    "- *DATA_DIR*\n",
    "\t- This directory precompiles the data for your specific T2T problem\n",
    "\t- You will need to set a new one for each new translation task\n",
    "    - Can be reused in training again if the same problem"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS['TMP_DIR'] = '/data.nfs/shared/translation_t2t_data'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS['DATA_DIR'] = '/data.nfs/awolf/translation/enfr_rev_train_data/translate_fr_en_wmt32k' "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "Now Generate the Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "#TranslateEnfrWmt32k.generate_data(PARAMS['DATA_DIR'], PARAMS['TMP_DIR']) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Set Train Directory\n",
    "You need to set a Train Dir which Tensor2Tensor will use to store the model checkpoints in. This can be used to load models and continue training from those checkpoints if stopped. The checkpoint files can get very large, because there are so many parameters in the model, so keep that in mind when selecting a directory.\n",
    "\n",
    "Tensor2Tensor will *overwrite* model checkpoints stored in this directory, so if you don't want your old checkpoints erased, make sure to change this directory when training a model. \n",
    "\n",
    "Set the *PARAMS['keep\\_checkpoint_max']* parameter below to let T2T know how many checkpoints to keep in the *TRAIN_DIR* before overwriting the oldest one."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS['TRAIN_DIR'] = '/data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train' "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Get TF Hparams Object\n",
    "Tensor2Tensor creates a [Tensorflow Hparam](https://www.tensorflow.org/api_docs/python/tf/contrib/training/HParams) object for training. IF you set the *T2T_HPARAMS* value already, the object will be made for you automatically below, and changes can be made manually."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "hparams = create_hparams(PARAMS['T2T_HPARAMS'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Make Any Hyperparameter Changes\n",
    "View your hparams below and make any changes as needed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "hparams.batch_size = 1024\n",
    "hparams.learning_rate_warmup_steps = 45000\n",
    "hparams.learning_rate = .4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "View all Hparams for Training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{u'attention_dropout': 0.1,\n",
       " u'attention_dropout_broadcast_dims': u'',\n",
       " u'attention_key_channels': 0,\n",
       " u'attention_value_channels': 0,\n",
       " u'batch_size': 1024,\n",
       " u'clip_grad_norm': 0.0,\n",
       " u'compress_steps': 0,\n",
       " u'daisy_chain_variables': True,\n",
       " u'dropout': 0.2,\n",
       " u'eval_drop_long_sequences': False,\n",
       " u'eval_run_autoregressive': False,\n",
       " u'factored_logits': False,\n",
       " u'ffn_layer': u'dense_relu_dense',\n",
       " u'filter_size': 2048,\n",
       " u'force_full_predict': False,\n",
       " u'grad_noise_scale': 0.0,\n",
       " u'hidden_size': 512,\n",
       " u'initializer': u'uniform_unit_scaling',\n",
       " u'initializer_gain': 1.0,\n",
       " u'input_modalities': u'default',\n",
       " u'kernel_height': 3,\n",
       " u'kernel_width': 1,\n",
       " u'label_smoothing': 0.1,\n",
       " u'layer_postprocess_sequence': u'da',\n",
       " u'layer_prepostprocess_dropout': 0.1,\n",
       " u'layer_prepostprocess_dropout_broadcast_dims': u'',\n",
       " u'layer_preprocess_sequence': u'n',\n",
       " u'learning_rate': 0.4,\n",
       " u'learning_rate_cosine_cycle_steps': 250000,\n",
       " u'learning_rate_decay_rate': 1.0,\n",
       " u'learning_rate_decay_scheme': u'none',\n",
       " u'learning_rate_decay_staircase': False,\n",
       " u'learning_rate_decay_steps': 5000,\n",
       " u'learning_rate_minimum': None,\n",
       " u'learning_rate_schedule': u'linear_warmup_rsqrt_decay',\n",
       " u'learning_rate_warmup_steps': 45000,\n",
       " u'length_bucket_step': 1.1,\n",
       " u'max_input_seq_length': 0,\n",
       " u'max_length': 256,\n",
       " u'max_relative_position': 0,\n",
       " u'max_target_seq_length': 0,\n",
       " u'min_length': 0,\n",
       " u'min_length_bucket': 8,\n",
       " u'moe_hidden_sizes': u'2048',\n",
       " u'moe_k': 2,\n",
       " u'moe_loss_coef': 0.01,\n",
       " u'moe_num_experts': 64,\n",
       " u'multiply_embedding_mode': u'sqrt_depth',\n",
       " u'nbr_decoder_problems': 1,\n",
       " u'no_data_parallelism': False,\n",
       " u'norm_epsilon': 1e-06,\n",
       " u'norm_type': u'layer',\n",
       " u'num_decoder_layers': 0,\n",
       " u'num_encoder_layers': 0,\n",
       " u'num_heads': 8,\n",
       " u'num_hidden_layers': 6,\n",
       " u'optimizer': u'Adam',\n",
       " u'optimizer_adam_beta1': 0.9,\n",
       " u'optimizer_adam_beta2': 0.997,\n",
       " u'optimizer_adam_epsilon': 1e-09,\n",
       " u'optimizer_momentum_momentum': 0.9,\n",
       " u'optimizer_momentum_nesterov': False,\n",
       " u'parameter_attention_key_channels': 0,\n",
       " u'parameter_attention_value_channels': 0,\n",
       " u'pos': u'timing',\n",
       " u'prepend_mode': u'none',\n",
       " u'problem_choice': u'adaptive',\n",
       " u'proximity_bias': False,\n",
       " u'relu_dropout': 0.1,\n",
       " u'relu_dropout_broadcast_dims': u'',\n",
       " u'sampling_method': u'argmax',\n",
       " u'sampling_temp': 1.0,\n",
       " u'scheduled_sampling_gold_mixin_prob': 0.5,\n",
       " u'scheduled_sampling_prob': 0.0,\n",
       " u'scheduled_sampling_warmup_steps': 50000,\n",
       " u'self_attention_type': u'dot_product',\n",
       " u'shared_embedding_and_softmax_weights': True,\n",
       " u'split_to_length': 0,\n",
       " u'summarize_grads': False,\n",
       " u'summarize_vars': False,\n",
       " u'symbol_dropout': 0.0,\n",
       " u'symbol_modality_num_shards': 16,\n",
       " u'symbol_modality_skip_top': False,\n",
       " u'target_modality': u'default',\n",
       " u'use_fixed_batch_size': False,\n",
       " u'use_pad_remover': True,\n",
       " u'weight_decay': 0.0,\n",
       " u'weight_noise': 0.0}"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "json.loads(hparams.to_json())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set Options for Training"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Keep these flags as they are"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "FLAGS.problems = PARAMS['T2T_Problem']\n",
    "FLAGS.model = PARAMS['T2T_Model']\n",
    "FLAGS.schedule = \"train_and_evaluate\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Train and Eval Steps\n",
    "Set how many steps to train the model for; It is easier to set a high amount of train_steps and then stop training when you reach a desired performance.\n",
    "\n",
    "Each time a checkpoint is saved the model will run evaluation for *X* amount of steps. 100 is a good default. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS['train_steps'] = 1000000\n",
    "PARAMS['eval_steps'] = 100"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### How often to save Checkpoint/Evaluate Model\n",
    "\n",
    "Use the flags below to control how often to save a model checkpoint. Note after a checkpoint is saved evaluation happens. You can control this by\n",
    "\n",
    "1. Steps\n",
    "\t- set *FLAGS.local\\_eval_frequency* to control how many training steps to take before training/ evaluating.\n",
    "\t- FLAGS.save\\_checkpoints_secs must be set to 0\n",
    "2. Time\n",
    "\t- Set FLAGS.save\\_checkpoints_secs to save a checkpoint and evaluate every x seconds while training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "FLAGS.save_checkpoints_secs = 0\n",
    "FLAGS.local_eval_frequency = 2000\n",
    "\n",
    "PARAMS['local_eval_frequency'] = FLAGS.local_eval_frequency\n",
    "PARAMS['save_checkpoints_secs'] = FLAGS.save_checkpoints_secs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Control how many Checkpoints to Save in the *train_dir*\n",
    "\n",
    "Set the parameter *PARAMS['keep_checkpoint_max']* below to control how many model checkpoints are kept in the *train_dir*; The oldest ckpt will be over written if this number is reached during training.\n",
    "\n",
    "\n",
    "These model ckpt files can get very large, so take that in mind when choosing the amount to keep at once. The Tensorboard logs for all eval/train steps will be kept in the *train_dir* regardless of this parameter. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PARAMS['keep_checkpoint_max'] = 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### GPU/ Distributed Training Flags\n",
    "- Use these to control single and multi GPU support\n",
    "- For more info see Tensor2Tensor's [documentation](https://github.com/tensorflow/tensor2tensor/blob/master/docs/distributed_training.md)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"1\" # Set this to control which GPU is visable during training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "FLAGS.gpu_memory_fraction = .99\n",
    "FLAGS.worker_gpu = 1\n",
    "FLAGS.ps_gpu = 2\n",
    "FLAGS.log_device_placement = True\n",
    "FLAGS.worker_replicas = 2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Tensorflow Experiment Object\n",
    "Tensor2Tensor does the training with Tensorflow by creating an [experiment](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment) for you. \n",
    "\n",
    "First thing we need to do is create the run_config"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:schedule=continuous_train_and_eval\n",
      "INFO:tensorflow:worker_gpu=1\n",
      "INFO:tensorflow:sync=False\n",
      "WARNING:tensorflow:Schedule=continuous_train_and_eval. Assuming that training is running on a single machine.\n",
      "INFO:tensorflow:datashard_devices: ['gpu:0']\n",
      "INFO:tensorflow:caching_devices: None\n",
      "INFO:tensorflow:ps_devices: ['gpu:0']\n"
     ]
    }
   ],
   "source": [
    "RUN_CONFIG = create_run_config(\n",
    "      model_dir=PARAMS['TRAIN_DIR'],\n",
    "      keep_checkpoint_max=PARAMS['keep_checkpoint_max'],\n",
    "      save_checkpoints_secs=PARAMS['save_checkpoints_secs'],\n",
    "      gpu_mem_fraction=FLAGS.gpu_memory_fraction\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Training the Model with Tensor2Tensor\n",
    "Use this dictionary below to refer to all the important variables for this models training"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'DATA_DIR': '/data.nfs/awolf/translation/enfr_rev_train_data/translate_fr_en_wmt32k',\n",
       " 'T2T_HPARAMS': 'transformer_base',\n",
       " 'T2T_Model': 'transformer',\n",
       " 'T2T_Problem': 'translate_enfr_wmt32k_rev',\n",
       " 'TMP_DIR': '/data.nfs/shared/translation_t2t_data',\n",
       " 'TRAIN_DIR': '/data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train',\n",
       " 'eval_steps': 100,\n",
       " 'keep_checkpoint_max': 3,\n",
       " 'local_eval_frequency': 2000,\n",
       " 'save_checkpoints_secs': 0,\n",
       " 'train_steps': 1000000}"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "PARAMS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To run the training function twice without resetting the notebook kernel, you'll need to run the commented out code below before training for the second or more time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# del hparams.data_dir\n",
    "# del hparams.train_steps\n",
    "# del hparams.eval_steps"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check logs below to see progress of model loss and BLEU score"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO:tensorflow:Using config: {'_save_checkpoints_secs': None, '_keep_checkpoint_max': 3, '_task_type': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0xa29d0d0>, '_keep_checkpoint_every_n_hours': 10000, '_session_config': gpu_options {\n",
      "  per_process_gpu_memory_fraction: 0.99\n",
      "}\n",
      "allow_soft_placement: true\n",
      "graph_options {\n",
      "  optimizer_options {\n",
      "  }\n",
      "}\n",
      ", 'use_tpu': False, '_tf_random_seed': None, '_num_worker_replicas': 0, '_task_id': 0, 't2t_device_info': {'num_async_replicas': 1}, '_evaluation_master': '', '_log_step_count_steps': 100, '_num_ps_replicas': 0, '_is_chief': True, '_tf_config': gpu_options {\n",
      "  per_process_gpu_memory_fraction: 1.0\n",
      "}\n",
      ", '_save_checkpoints_steps': 1000, '_environment': 'local', '_master': '', '_model_dir': '/data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train', 'data_parallelism': <tensor2tensor.utils.expert_utils.Parallelism object at 0xa28cd10>, '_save_summary_steps': 100}\n",
      "WARNING:tensorflow:Estimator's model_fn (<function wrapping_model_fn at 0xa2909b0>) includes params argument, but params are not passed to Estimator.\n",
      "INFO:tensorflow:Using ValidationMonitor\n",
      "WARNING:tensorflow:From /data/dataiku/dss-data-dir/code-envs/python/Tensor2Tensor_GPU/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/monitors.py:267: __init__ (from tensorflow.contrib.learn.python.learn.monitors) is deprecated and will be removed after 2016-12-05.\n",
      "Instructions for updating:\n",
      "Monitors are deprecated. Please use tf.train.SessionRunHook.\n",
      "INFO:tensorflow:Reading data files from /data.nfs/awolf/translation/enfr_rev_train_data/translate_fr_en_wmt32k/translate_enfr_wmt32k-train*\n",
      "INFO:tensorflow:partition: 0 num_data_files: 100\n",
      "INFO:tensorflow:Setting T2TModel mode to 'train'\n",
      "INFO:tensorflow:Using variable initializer: uniform_unit_scaling\n",
      "INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_27664_512.bottom\n",
      "INFO:tensorflow:Transforming 'targets' with symbol_modality_27664_512.targets_bottom\n",
      "INFO:tensorflow:Building model body\n",
      "INFO:tensorflow:Transforming body output with symbol_modality_27664_512.top\n",
      "INFO:tensorflow:Base learning rate: 0.400000\n",
      "INFO:tensorflow:Trainable Variables Total size: 58284032\n",
      "INFO:tensorflow:Using optimizer Adam\n",
      "INFO:tensorflow:Create CheckpointSaverHook.\n",
      "INFO:tensorflow:Saving checkpoints for 1 into /data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train/model.ckpt.\n",
      "INFO:tensorflow:loss = 9.106932, step = 1\n",
      "INFO:tensorflow:global_step/sec: 3.43435\n",
      "INFO:tensorflow:loss = 8.324527, step = 101 (29.119 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78483\n",
      "INFO:tensorflow:loss = 7.803612, step = 201 (26.421 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.79454\n",
      "INFO:tensorflow:loss = 7.384148, step = 301 (26.355 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.79226\n",
      "INFO:tensorflow:loss = 6.955927, step = 401 (26.368 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78903\n",
      "INFO:tensorflow:loss = 6.3720956, step = 501 (26.392 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.80311\n",
      "INFO:tensorflow:loss = 6.4820685, step = 601 (26.294 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.8036\n",
      "INFO:tensorflow:loss = 5.8566775, step = 701 (26.291 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.79633\n",
      "INFO:tensorflow:loss = 5.17837, step = 801 (26.343 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.7925\n",
      "INFO:tensorflow:loss = 5.1035585, step = 901 (26.367 sec)\n",
      "INFO:tensorflow:Saving checkpoints for 1001 into /data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train/model.ckpt.\n",
      "INFO:tensorflow:global_step/sec: 2.64961\n",
      "INFO:tensorflow:loss = 4.7494783, step = 1001 (37.741 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.79912\n",
      "INFO:tensorflow:loss = 4.866171, step = 1101 (26.322 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78545\n",
      "INFO:tensorflow:loss = 4.4327636, step = 1201 (26.417 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78959\n",
      "INFO:tensorflow:loss = 4.5239463, step = 1301 (26.389 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78812\n",
      "INFO:tensorflow:loss = 4.509579, step = 1401 (26.397 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78717\n",
      "INFO:tensorflow:loss = 4.735831, step = 1501 (26.405 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.7777\n",
      "INFO:tensorflow:loss = 4.397319, step = 1601 (26.471 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.77413\n",
      "INFO:tensorflow:loss = 3.7875838, step = 1701 (26.497 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78586\n",
      "INFO:tensorflow:loss = 3.969005, step = 1801 (26.415 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78755\n",
      "INFO:tensorflow:loss = 4.2618384, step = 1901 (26.400 sec)\n",
      "INFO:tensorflow:Reading data files from /data.nfs/awolf/translation/enfr_rev_train_data/translate_fr_en_wmt32k/translate_enfr_wmt32k-dev*\n",
      "INFO:tensorflow:partition: 0 num_data_files: 1\n",
      "INFO:tensorflow:Setting T2TModel mode to 'eval'\n",
      "INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.symbol_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.attention_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.relu_dropout to 0.0\n",
      "INFO:tensorflow:Using variable initializer: uniform_unit_scaling\n",
      "INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_27664_512.bottom\n",
      "INFO:tensorflow:Transforming 'targets' with symbol_modality_27664_512.targets_bottom\n",
      "INFO:tensorflow:Building model body\n",
      "INFO:tensorflow:Transforming body output with symbol_modality_27664_512.top\n",
      "INFO:tensorflow:Starting evaluation at 2018-03-05-18:25:27\n",
      "INFO:tensorflow:Restoring parameters from /data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train/model.ckpt-1001\n",
      "INFO:tensorflow:Evaluation [1/100]\n",
      "INFO:tensorflow:Evaluation [2/100]\n",
      "INFO:tensorflow:Evaluation [3/100]\n",
      "INFO:tensorflow:Evaluation [4/100]\n",
      "INFO:tensorflow:Evaluation [5/100]\n",
      "INFO:tensorflow:Evaluation [6/100]\n",
      "INFO:tensorflow:Evaluation [7/100]\n",
      "INFO:tensorflow:Evaluation [8/100]\n",
      "INFO:tensorflow:Evaluation [9/100]\n",
      "INFO:tensorflow:Evaluation [10/100]\n",
      "INFO:tensorflow:Evaluation [11/100]\n",
      "INFO:tensorflow:Evaluation [12/100]\n",
      "INFO:tensorflow:Evaluation [13/100]\n",
      "INFO:tensorflow:Evaluation [14/100]\n",
      "INFO:tensorflow:Evaluation [15/100]\n",
      "INFO:tensorflow:Evaluation [16/100]\n",
      "INFO:tensorflow:Evaluation [17/100]\n",
      "INFO:tensorflow:Evaluation [18/100]\n",
      "INFO:tensorflow:Evaluation [19/100]\n",
      "INFO:tensorflow:Evaluation [20/100]\n",
      "INFO:tensorflow:Evaluation [21/100]\n",
      "INFO:tensorflow:Evaluation [22/100]\n",
      "INFO:tensorflow:Evaluation [23/100]\n",
      "INFO:tensorflow:Evaluation [24/100]\n",
      "INFO:tensorflow:Evaluation [25/100]\n",
      "INFO:tensorflow:Evaluation [26/100]\n",
      "INFO:tensorflow:Evaluation [27/100]\n",
      "INFO:tensorflow:Evaluation [28/100]\n",
      "INFO:tensorflow:Evaluation [29/100]\n",
      "INFO:tensorflow:Evaluation [30/100]\n",
      "INFO:tensorflow:Evaluation [31/100]\n",
      "INFO:tensorflow:Evaluation [32/100]\n",
      "INFO:tensorflow:Evaluation [33/100]\n",
      "INFO:tensorflow:Evaluation [34/100]\n",
      "INFO:tensorflow:Evaluation [35/100]\n",
      "INFO:tensorflow:Evaluation [36/100]\n",
      "INFO:tensorflow:Evaluation [37/100]\n",
      "INFO:tensorflow:Evaluation [38/100]\n",
      "INFO:tensorflow:Evaluation [39/100]\n",
      "INFO:tensorflow:Evaluation [40/100]\n",
      "INFO:tensorflow:Evaluation [41/100]\n",
      "INFO:tensorflow:Evaluation [42/100]\n",
      "INFO:tensorflow:Evaluation [43/100]\n",
      "INFO:tensorflow:Evaluation [44/100]\n",
      "INFO:tensorflow:Evaluation [45/100]\n",
      "INFO:tensorflow:Evaluation [46/100]\n",
      "INFO:tensorflow:Evaluation [47/100]\n",
      "INFO:tensorflow:Evaluation [48/100]\n",
      "INFO:tensorflow:Evaluation [49/100]\n",
      "INFO:tensorflow:Evaluation [50/100]\n",
      "INFO:tensorflow:Evaluation [51/100]\n",
      "INFO:tensorflow:Evaluation [52/100]\n",
      "INFO:tensorflow:Evaluation [53/100]\n",
      "INFO:tensorflow:Evaluation [54/100]\n",
      "INFO:tensorflow:Evaluation [55/100]\n",
      "INFO:tensorflow:Evaluation [56/100]\n",
      "INFO:tensorflow:Evaluation [57/100]\n",
      "INFO:tensorflow:Evaluation [58/100]\n",
      "INFO:tensorflow:Evaluation [59/100]\n",
      "INFO:tensorflow:Evaluation [60/100]\n",
      "INFO:tensorflow:Evaluation [61/100]\n",
      "INFO:tensorflow:Evaluation [62/100]\n",
      "INFO:tensorflow:Evaluation [63/100]\n",
      "INFO:tensorflow:Evaluation [64/100]\n",
      "INFO:tensorflow:Evaluation [65/100]\n",
      "INFO:tensorflow:Evaluation [66/100]\n",
      "INFO:tensorflow:Evaluation [67/100]\n",
      "INFO:tensorflow:Evaluation [68/100]\n",
      "INFO:tensorflow:Evaluation [69/100]\n",
      "INFO:tensorflow:Evaluation [70/100]\n",
      "INFO:tensorflow:Evaluation [71/100]\n",
      "INFO:tensorflow:Evaluation [72/100]\n",
      "INFO:tensorflow:Evaluation [73/100]\n",
      "INFO:tensorflow:Evaluation [74/100]\n",
      "INFO:tensorflow:Evaluation [75/100]\n",
      "INFO:tensorflow:Evaluation [76/100]\n",
      "INFO:tensorflow:Evaluation [77/100]\n",
      "INFO:tensorflow:Evaluation [78/100]\n",
      "INFO:tensorflow:Evaluation [79/100]\n",
      "INFO:tensorflow:Evaluation [80/100]\n",
      "INFO:tensorflow:Evaluation [81/100]\n",
      "INFO:tensorflow:Evaluation [82/100]\n",
      "INFO:tensorflow:Evaluation [83/100]\n",
      "INFO:tensorflow:Evaluation [84/100]\n",
      "INFO:tensorflow:Evaluation [85/100]\n",
      "INFO:tensorflow:Evaluation [86/100]\n",
      "INFO:tensorflow:Evaluation [87/100]\n",
      "INFO:tensorflow:Evaluation [88/100]\n",
      "INFO:tensorflow:Evaluation [89/100]\n",
      "INFO:tensorflow:Evaluation [90/100]\n",
      "INFO:tensorflow:Evaluation [91/100]\n",
      "INFO:tensorflow:Evaluation [92/100]\n",
      "INFO:tensorflow:Evaluation [93/100]\n",
      "INFO:tensorflow:Evaluation [94/100]\n",
      "INFO:tensorflow:Evaluation [95/100]\n",
      "INFO:tensorflow:Evaluation [96/100]\n",
      "INFO:tensorflow:Evaluation [97/100]\n",
      "INFO:tensorflow:Evaluation [98/100]\n",
      "INFO:tensorflow:Evaluation [99/100]\n",
      "INFO:tensorflow:Evaluation [100/100]\n",
      "INFO:tensorflow:Finished evaluation at 2018-03-05-18:25:48\n",
      "INFO:tensorflow:Saving dict for global step 1001: global_step = 1001, loss = 5.03793, metrics-translate_enfr_wmt32k/accuracy = 0.13121094, metrics-translate_enfr_wmt32k/accuracy_per_sequence = 0.0, metrics-translate_enfr_wmt32k/accuracy_top5 = 0.3142502, metrics-translate_enfr_wmt32k/approx_bleu_score = 0.021571739, metrics-translate_enfr_wmt32k/neg_log_perplexity = -5.8045664, metrics-translate_enfr_wmt32k/rouge_2_fscore = 0.08114186, metrics-translate_enfr_wmt32k/rouge_L_fscore = 0.34100863\n",
      "INFO:tensorflow:Validation (step 2000): loss = 5.03793, global_step = 1001, metrics-translate_enfr_wmt32k/neg_log_perplexity = -5.8045664, metrics-translate_enfr_wmt32k/accuracy = 0.13121094, metrics-translate_enfr_wmt32k/approx_bleu_score = 0.021571739, metrics-translate_enfr_wmt32k/rouge_2_fscore = 0.08114186, metrics-translate_enfr_wmt32k/accuracy_per_sequence = 0.0, metrics-translate_enfr_wmt32k/rouge_L_fscore = 0.34100863, metrics-translate_enfr_wmt32k/accuracy_top5 = 0.3142502\n",
      "INFO:tensorflow:Reading data files from /data.nfs/awolf/translation/enfr_rev_train_data/translate_fr_en_wmt32k/translate_enfr_wmt32k-dev*\n",
      "INFO:tensorflow:partition: 0 num_data_files: 1\n",
      "INFO:tensorflow:Setting T2TModel mode to 'eval'\n",
      "INFO:tensorflow:Setting hparams.layer_prepostprocess_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.symbol_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.attention_dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.dropout to 0.0\n",
      "INFO:tensorflow:Setting hparams.relu_dropout to 0.0\n",
      "INFO:tensorflow:Using variable initializer: uniform_unit_scaling\n",
      "INFO:tensorflow:Transforming feature 'inputs' with symbol_modality_27664_512.bottom\n",
      "INFO:tensorflow:Transforming 'targets' with symbol_modality_27664_512.targets_bottom\n",
      "INFO:tensorflow:Building model body\n",
      "INFO:tensorflow:Transforming body output with symbol_modality_27664_512.top\n",
      "INFO:tensorflow:Starting evaluation at 2018-03-05-18:25:57\n",
      "INFO:tensorflow:Restoring parameters from /data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train/model.ckpt-1001\n",
      "INFO:tensorflow:Evaluation [1/100]\n",
      "INFO:tensorflow:Evaluation [2/100]\n",
      "INFO:tensorflow:Evaluation [3/100]\n",
      "INFO:tensorflow:Evaluation [4/100]\n",
      "INFO:tensorflow:Evaluation [5/100]\n",
      "INFO:tensorflow:Evaluation [6/100]\n",
      "INFO:tensorflow:Evaluation [7/100]\n",
      "INFO:tensorflow:Evaluation [8/100]\n",
      "INFO:tensorflow:Evaluation [9/100]\n",
      "INFO:tensorflow:Evaluation [10/100]\n",
      "INFO:tensorflow:Evaluation [11/100]\n",
      "INFO:tensorflow:Evaluation [12/100]\n",
      "INFO:tensorflow:Evaluation [13/100]\n",
      "INFO:tensorflow:Evaluation [14/100]\n",
      "INFO:tensorflow:Evaluation [15/100]\n",
      "INFO:tensorflow:Evaluation [16/100]\n",
      "INFO:tensorflow:Evaluation [17/100]\n",
      "INFO:tensorflow:Evaluation [18/100]\n",
      "INFO:tensorflow:Evaluation [19/100]\n",
      "INFO:tensorflow:Evaluation [20/100]\n",
      "INFO:tensorflow:Evaluation [21/100]\n",
      "INFO:tensorflow:Evaluation [22/100]\n",
      "INFO:tensorflow:Evaluation [23/100]\n",
      "INFO:tensorflow:Evaluation [24/100]\n",
      "INFO:tensorflow:Evaluation [25/100]\n",
      "INFO:tensorflow:Evaluation [26/100]\n",
      "INFO:tensorflow:Evaluation [27/100]\n",
      "INFO:tensorflow:Evaluation [28/100]\n",
      "INFO:tensorflow:Evaluation [29/100]\n",
      "INFO:tensorflow:Evaluation [30/100]\n",
      "INFO:tensorflow:Evaluation [31/100]\n",
      "INFO:tensorflow:Evaluation [32/100]\n",
      "INFO:tensorflow:Evaluation [33/100]\n",
      "INFO:tensorflow:Evaluation [34/100]\n",
      "INFO:tensorflow:Evaluation [35/100]\n",
      "INFO:tensorflow:Evaluation [36/100]\n",
      "INFO:tensorflow:Evaluation [37/100]\n",
      "INFO:tensorflow:Evaluation [38/100]\n",
      "INFO:tensorflow:Evaluation [39/100]\n",
      "INFO:tensorflow:Evaluation [40/100]\n",
      "INFO:tensorflow:Evaluation [41/100]\n",
      "INFO:tensorflow:Evaluation [42/100]\n",
      "INFO:tensorflow:Evaluation [43/100]\n",
      "INFO:tensorflow:Evaluation [44/100]\n",
      "INFO:tensorflow:Evaluation [45/100]\n",
      "INFO:tensorflow:Evaluation [46/100]\n",
      "INFO:tensorflow:Evaluation [47/100]\n",
      "INFO:tensorflow:Evaluation [48/100]\n",
      "INFO:tensorflow:Evaluation [49/100]\n",
      "INFO:tensorflow:Evaluation [50/100]\n",
      "INFO:tensorflow:Evaluation [51/100]\n",
      "INFO:tensorflow:Evaluation [52/100]\n",
      "INFO:tensorflow:Evaluation [53/100]\n",
      "INFO:tensorflow:Evaluation [54/100]\n",
      "INFO:tensorflow:Evaluation [55/100]\n",
      "INFO:tensorflow:Evaluation [56/100]\n",
      "INFO:tensorflow:Evaluation [57/100]\n",
      "INFO:tensorflow:Evaluation [58/100]\n",
      "INFO:tensorflow:Evaluation [59/100]\n",
      "INFO:tensorflow:Evaluation [60/100]\n",
      "INFO:tensorflow:Evaluation [61/100]\n",
      "INFO:tensorflow:Evaluation [62/100]\n",
      "INFO:tensorflow:Evaluation [63/100]\n",
      "INFO:tensorflow:Evaluation [64/100]\n",
      "INFO:tensorflow:Evaluation [65/100]\n",
      "INFO:tensorflow:Evaluation [66/100]\n",
      "INFO:tensorflow:Evaluation [67/100]\n",
      "INFO:tensorflow:Evaluation [68/100]\n",
      "INFO:tensorflow:Evaluation [69/100]\n",
      "INFO:tensorflow:Evaluation [70/100]\n",
      "INFO:tensorflow:Evaluation [71/100]\n",
      "INFO:tensorflow:Evaluation [72/100]\n",
      "INFO:tensorflow:Evaluation [73/100]\n",
      "INFO:tensorflow:Evaluation [74/100]\n",
      "INFO:tensorflow:Evaluation [75/100]\n",
      "INFO:tensorflow:Evaluation [76/100]\n",
      "INFO:tensorflow:Evaluation [77/100]\n",
      "INFO:tensorflow:Evaluation [78/100]\n",
      "INFO:tensorflow:Evaluation [79/100]\n",
      "INFO:tensorflow:Evaluation [80/100]\n",
      "INFO:tensorflow:Evaluation [81/100]\n",
      "INFO:tensorflow:Evaluation [82/100]\n",
      "INFO:tensorflow:Evaluation [83/100]\n",
      "INFO:tensorflow:Evaluation [84/100]\n",
      "INFO:tensorflow:Evaluation [85/100]\n",
      "INFO:tensorflow:Evaluation [86/100]\n",
      "INFO:tensorflow:Evaluation [87/100]\n",
      "INFO:tensorflow:Evaluation [88/100]\n",
      "INFO:tensorflow:Evaluation [89/100]\n",
      "INFO:tensorflow:Evaluation [90/100]\n",
      "INFO:tensorflow:Evaluation [91/100]\n",
      "INFO:tensorflow:Evaluation [92/100]\n",
      "INFO:tensorflow:Evaluation [93/100]\n",
      "INFO:tensorflow:Evaluation [94/100]\n",
      "INFO:tensorflow:Evaluation [95/100]\n",
      "INFO:tensorflow:Evaluation [96/100]\n",
      "INFO:tensorflow:Evaluation [97/100]\n",
      "INFO:tensorflow:Evaluation [98/100]\n",
      "INFO:tensorflow:Evaluation [99/100]\n",
      "INFO:tensorflow:Evaluation [100/100]\n",
      "INFO:tensorflow:Finished evaluation at 2018-03-05-18:26:17\n",
      "INFO:tensorflow:Saving dict for global step 1001: global_step = 1001, loss = 5.0379305, metrics-translate_enfr_wmt32k/accuracy = 0.13121094, metrics-translate_enfr_wmt32k/accuracy_per_sequence = 0.0, metrics-translate_enfr_wmt32k/accuracy_top5 = 0.3142502, metrics-translate_enfr_wmt32k/approx_bleu_score = 0.021571739, metrics-translate_enfr_wmt32k/neg_log_perplexity = -5.8045664, metrics-translate_enfr_wmt32k/rouge_2_fscore = 0.08114186, metrics-translate_enfr_wmt32k/rouge_L_fscore = 0.34100863\n",
      "INFO:tensorflow:Validation (step 2000): loss = 5.0379305, global_step = 1001, metrics-translate_enfr_wmt32k/neg_log_perplexity = -5.8045664, metrics-translate_enfr_wmt32k/accuracy = 0.13121094, metrics-translate_enfr_wmt32k/approx_bleu_score = 0.021571739, metrics-translate_enfr_wmt32k/rouge_2_fscore = 0.08114186, metrics-translate_enfr_wmt32k/accuracy_per_sequence = 0.0, metrics-translate_enfr_wmt32k/rouge_L_fscore = 0.34100863, metrics-translate_enfr_wmt32k/accuracy_top5 = 0.3142502\n",
      "INFO:tensorflow:Saving checkpoints for 2001 into /data.nfs/awolf/translation/models/translate_enfr_wmt32k_rev/transformer_train/model.ckpt.\n",
      "INFO:tensorflow:global_step/sec: 1.05547\n",
      "INFO:tensorflow:loss = 4.3901515, step = 2001 (94.746 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.75414\n",
      "INFO:tensorflow:loss = 3.9306803, step = 2101 (26.636 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.76246\n",
      "INFO:tensorflow:loss = 4.335381, step = 2201 (26.578 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.79176\n",
      "INFO:tensorflow:loss = 4.32332, step = 2301 (26.373 sec)\n",
      "INFO:tensorflow:global_step/sec: 3.78721\n",
      "INFO:tensorflow:loss = 3.902763, step = 2401 (26.405 sec)\n"
     ]
    }
   ],
   "source": [
    "exp_fn = create_experiment(\n",
    "        run_config=RUN_CONFIG,\n",
    "        hparams=hparams,\n",
    "        model_name=PARAMS['T2T_Model'],\n",
    "        problem_name=PARAMS['T2T_Problem'],\n",
    "        data_dir=PARAMS['DATA_DIR'],\n",
    "        train_steps=PARAMS['train_steps'],\n",
    "        eval_steps=PARAMS['eval_steps']\n",
    "    )\n",
    "exp_fn.train_and_evaluate()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "creator": "admin",
  "kernelspec": {
   "display_name": "Python (env Tensor2Tensor_GPU)",
   "language": "python",
   "name": "py-dku-venv-tensor2tensor_gpu"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
